Skip to content

[SPARK-57555][FOLLOWUP][SQL] Add escape-hatch conf to keep legacy JDBC TIME-to-timestamp mapping#56884

Open
cloud-fan wants to merge 2 commits into
apache:masterfrom
cloud-fan:dw-jdbc-time-type-legacy-conf
Open

[SPARK-57555][FOLLOWUP][SQL] Add escape-hatch conf to keep legacy JDBC TIME-to-timestamp mapping#56884
cloud-fan wants to merge 2 commits into
apache:masterfrom
cloud-fan:dw-jdbc-time-type-legacy-conf

Conversation

@cloud-fan

@cloud-fan cloud-fan commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

SPARK-57555 made the built-in JDBC data source read SQL TIME columns as TimeType (instead of the legacy TimestampType) when spark.sql.timeType.enabled is true. That is a schema-inference change for existing JDBC reads: a column that previously came back as TimestampType now comes back as TimeType, with no DDL change by the user.

This PR adds an internal escape-hatch conf, spark.sql.legacy.jdbc.timeMapping.enabled (default false), following the existing spark.sql.legacy.<db>.*Mapping.enabled family. When set to true, JDBC TIME columns keep the legacy TimestampType mapping even when spark.sql.timeType.enabled is true. It has no effect when spark.sql.timeType.enabled is false.

Why are the changes needed?

spark.sql.timeType.enabled is a single global switch that governs TIME support across all of Spark. Enabling it flips JDBC TIME reads from TimestampType to TimeType in one step, which can silently break workloads that depend on the old mapping (downstream casts, comparisons, stored schemas, serialization). The escape hatch lets such workloads opt back into the legacy JDBC behavior without having to keep the entire TIME type disabled.

Does this PR introduce any user-facing change?

No behavioral change by default. The new conf defaults to false, preserving SPARK-57555 behavior. When explicitly set to true (and spark.sql.timeType.enabled is true), JDBC TIME columns read as TimestampType as they did before SPARK-57555.

How was this patch tested?

Added a unit test in JDBCSuite asserting that, with spark.sql.timeType.enabled=true, the column reads as TimestampType when the escape hatch is on and as TimeType when it is off.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8)

…C TIME-to-timestamp mapping

### What changes were proposed in this pull request?
SPARK-57555 made the built-in JDBC data source read SQL `TIME` columns as `TimeType` (instead of the legacy `TimestampType`) when `spark.sql.timeType.enabled` is true. That is a schema-inference change for existing JDBC reads: a column that previously came back as `TimestampType` now comes back as `TimeType`, with no DDL change by the user.

This PR adds an internal escape-hatch conf, `spark.sql.legacy.jdbc.timeMapping.enabled` (default `false`), following the existing `spark.sql.legacy.<db>.*Mapping.enabled` family. When set to true, JDBC `TIME` columns keep the legacy `TimestampType` mapping even when `spark.sql.timeType.enabled` is true. It has no effect when `spark.sql.timeType.enabled` is false.

### Why are the changes needed?
`spark.sql.timeType.enabled` is a single global switch that governs `TIME` support across all of Spark. Enabling it flips JDBC `TIME` reads from `TimestampType` to `TimeType` in one step, which can silently break workloads that depend on the old mapping (downstream casts, comparisons, stored schemas, serialization). The escape hatch lets such workloads opt back into the legacy JDBC behavior without having to keep the entire `TIME` type disabled.

### Does this PR introduce any user-facing change?
No behavioral change by default. The new conf defaults to `false`, preserving SPARK-57555 behavior. When explicitly set to `true` (and `spark.sql.timeType.enabled` is `true`), JDBC `TIME` columns read as `TimestampType` as they did before SPARK-57555.

### How was this patch tested?
Added a unit test in `JDBCSuite` asserting that, with `spark.sql.timeType.enabled=true`, the column reads as `TimestampType` when the escape hatch is on and as `TimeType` when it is off.

### Was this patch authored or co-authored using generative AI tooling?
No
@cloud-fan

Copy link
Copy Markdown
Contributor Author

cc @MaxGekk

@MaxGekk

MaxGekk commented Jun 30, 2026

Copy link
Copy Markdown
Member

I reviewed the change and didn't find correctness issues.

The config semantics look right to me: JDBC TIME maps to TimeType only when spark.sql.timeType.enabled=true and the new legacy escape hatch is false; when the escape hatch is true it falls back to the existing timestamp mapping path, preserving preferTimestampNTZ behavior through getTimestampType(isTimestampNTZ).

I also checked that dialect-specific mappings such as PostgreSQL timetz and H2 TIME_WITH_TIMEZONE are not affected by this default mapper change, which seems intentional. The .version("4.3.0") also looks consistent with the expected branch policy for this change.

Only minor note: the new test does not explicitly cover spark.sql.timeType.enabled=false together with the escape hatch enabled, but the existing legacy TIME test already covers the disabled-TIME path and the code path is straightforward. So this is not a blocker from my side.

…mapping conf and cover disabled-TIME case

Add `.withBindingPolicy(ConfigBindingPolicy.SESSION)` to the new
`spark.sql.legacy.jdbc.timeMapping.enabled` conf so it passes
`SparkConfigBindingPolicySuite`, and add a test case asserting the escape
hatch has no effect when `spark.sql.timeType.enabled` is false.

Co-authored-by: Isaac

@shrirangmhalgi shrirangmhalgi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the followup @cloud-fan. LGTM - the escape-hatch semantics are correct: isTimeTypeEnabled && !legacyJdbcTimeMappingEnabled preserves the existing behavior when either flag disables TimeType. Verified the 3-way test matrix covers all combinations.

Note: per-dialect getCatalystType overrides run before the default mapper (JdbcUtils.getSchema line 335), so they need their own legacyJdbcTimeMappingEnabled check for the escape hatch to apply. I'll include this in #56904 (PostgreSQL TIME dialect) and other followup PRs once this PR merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants